Convex Banding of the Covariance Matrix.
Authors
Abstract
We introduce a new sparse estimator of the covariance matrix for high-dimensional models in which the variables have a known ordering. Our estimator, which is the solution to a convex optimization problem, is equivalently expressed as an estimator which tapers the sample covariance matrix by a Toeplitz, sparsely-banded, data-adaptive matrix. As a result of this adaptivity, the convex banding estimator enjoys theoretical optimality properties not attained by previous banding or tapered estimators. In particular, our convex banding estimator is minimax rate adaptive in Frobenius and operator norms, up to log factors, over commonly-studied classes of covariance matrices, and over more general classes. Furthermore, it correctly recovers the bandwidth when the true covariance is exactly banded. Our convex formulation admits a simple and efficient algorithm. Empirical studies demonstrate its practical effectiveness and illustrate that our exactly-banded estimator works well even when the true covariance matrix is only close to a banded matrix, confirming our theoretical results. Our method compares favorably with all existing methods, in terms of accuracy and speed. We illustrate the practical merits of the convex banding estimator by showing that it can be used to improve the performance of discriminant analysis for classifying sound recordings.
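The abstract describes the estimator only at a high level, so the following is a minimal sketch of the generic banding-and-tapering idea it builds on: form the sample covariance matrix and multiply it elementwise by a banded Toeplitz weight matrix. The linearly decaying taper, the `bandwidth` parameter, and the synthetic AR(1)-style data are illustrative assumptions; the convex banding estimator of the paper instead chooses its tapering weights adaptively by solving a convex optimization problem, which is not reproduced here.

```python
import numpy as np

def sample_covariance(X):
    """Sample covariance of an n-by-p data matrix X (rows are observations)."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / n

def toeplitz_taper(p, bandwidth):
    """Banded Toeplitz weight matrix with linearly decaying off-diagonal weights.
    A fixed, non-adaptive taper used only to illustrate the idea; the convex
    banding estimator learns its (sparsely banded) weights from the data."""
    lag = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    return np.clip(1.0 - lag / (bandwidth + 1), 0.0, 1.0)

def tapered_covariance(X, bandwidth):
    """Elementwise (Schur) product of the sample covariance with the taper."""
    S = sample_covariance(X)
    return S * toeplitz_taper(S.shape[0], bandwidth)

# Illustrative usage on synthetic AR(1)-style data (hypothetical example).
rng = np.random.default_rng(0)
p, n = 30, 100
true_cov = 0.6 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
X = rng.multivariate_normal(np.zeros(p), true_cov, size=n)
Sigma_hat = tapered_covariance(X, bandwidth=5)
```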
Similar Resources
Graph-Guided Banding of the Covariance Matrix
Abstract Regularization has become a primary tool for developing reliable estimators of the covariance matrix in high-dimensional settings. To curb the curse of dimensionality, numerous methods assume that the population covariance (or inverse covariance) matrix is sparse, while making no particular structural assumptions on the desired pattern of sparsity. A highly-related, yet complementary, ...
Regularized estimation of large covariance matrices
This paper considers estimating a covariance matrix of p variables from n observations by either banding or tapering the sample covariance matrix, or estimating a banded version of the inverse of the covariance. We show that these estimates are consistent in the operator norm as long as (logp)/n→ 0, and obtain explicit rates. The results are uniform over some fairly natural well-conditioned fam...
Estimation of Large Covariance Matrices
This paper considers estimating a covariance matrix of p variables from n observations by either banding or tapering the sample covariance matrix, or estimating a banded version of the inverse of the covariance. We show that these estimates are consistent in the operator norm as long as (logp)/n→ 0, and obtain explicit rates. The results are uniform over some fairly natural well-conditioned fam...
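For contrast with the data-adaptive taper sketched above, here is a rough illustration of the hard banding operation described in the two preceding abstracts (all sample-covariance entries more than k positions from the diagonal are set to zero), together with a simple sample-splitting heuristic for picking k. The function names and the splitting rule are assumptions made for illustration and are not the exact resampling scheme of the cited work.

```python
import numpy as np

def band(S, k):
    """Hard banding: keep entries within k of the diagonal, zero out the rest."""
    lag = np.abs(np.subtract.outer(np.arange(S.shape[0]), np.arange(S.shape[0])))
    return np.where(lag <= k, S, 0.0)

def choose_bandwidth(X, candidates, n_splits=20, seed=0):
    """Pick k minimizing an average Frobenius discrepancy over random splits
    (a simplified stand-in for the resampling scheme in the cited papers)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    risks = np.zeros(len(candidates))
    for _ in range(n_splits):
        perm = rng.permutation(n)
        train, test = perm[: n // 2], perm[n // 2:]
        S_train = np.cov(X[train], rowvar=False)
        S_test = np.cov(X[test], rowvar=False)
        for i, k in enumerate(candidates):
            risks[i] += np.linalg.norm(band(S_train, k) - S_test, "fro")
    return candidates[int(np.argmin(risks))]
```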
A new approach to Cholesky-based covariance regularization in high dimensions
In this paper we propose a new regression interpretation of the Cholesky factor of the covariance matrix, as opposed to the well-known regression interpretation of the Cholesky factor of the inverse covariance, which leads to a new class of regularized covariance estimators suitable for high-dimensional problems. Regularizing the Cholesky factor of the covariance via this regression interpretat...
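The "well-known regression interpretation" this abstract contrasts with is that, for ordered variables, regressing each variable on all of its predecessors yields the modified Cholesky factorization of the inverse covariance. Below is a minimal sketch of that classical construction, assuming an n-by-p data matrix with n > p and full column rank; the paper's new interpretation, for the Cholesky factor of the covariance itself, is not reproduced here.

```python
import numpy as np

def cholesky_of_precision(X):
    """Modified Cholesky of the inverse covariance via successive regressions.

    Regress the j-th centered column on columns 0..j-1. With T unit lower
    triangular holding the negated coefficients and D = diag(d) the residual
    variances, the precision matrix satisfies Sigma_inv = T.T @ inv(D) @ T.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    T = np.eye(p)
    d = np.empty(p)
    d[0] = Xc[:, 0] @ Xc[:, 0] / n
    for j in range(1, p):
        coef, *_ = np.linalg.lstsq(Xc[:, :j], Xc[:, j], rcond=None)
        resid = Xc[:, j] - Xc[:, :j] @ coef
        T[j, :j] = -coef
        d[j] = resid @ resid / n
    return T, d
```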
Computationally efficient banding of large covariance matrices for ordered data and connections to banding the inverse Cholesky factor
In this article, we propose a computationally efficient approach to estimate (large) p-dimensional covariance matrices of ordered (or longitudinal) data based on an independent sample of size n. To do this, we construct the estimator based on a k-band partial autocorrelation matrix with the number of bands chosen using an exact multiple hypothesis testing procedure. This approach is considerabl...
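As a small illustration of the quantity this abstract builds on, the sketch below computes a sample partial autocorrelation between two ordered variables, i.e., their partial correlation given the variables lying between them. The exact multiple hypothesis testing procedure used to choose the number of bands is specific to the cited paper and is not reproduced.

```python
import numpy as np

def partial_autocorrelation(X, i, j):
    """Sample partial correlation of columns i and j of X (i < j), given the
    intervening columns i+1, ..., j-1 of the ordered variables."""
    Xc = X - X.mean(axis=0)
    mid, xi, xj = Xc[:, i + 1:j], Xc[:, i], Xc[:, j]
    if mid.shape[1] > 0:
        xi = xi - mid @ np.linalg.lstsq(mid, xi, rcond=None)[0]
        xj = xj - mid @ np.linalg.lstsq(mid, xj, rcond=None)[0]
    return (xi @ xj) / np.sqrt((xi @ xi) * (xj @ xj))
```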
Journal: Journal of the American Statistical Association
Volume: 111, Issue: 514
Pages: -
Publication year: 2016